
    PropBase QueryLayer: a single portal to UK physical property databases

    Until recently, the delivery of geological information to industry and the public was achieved by geological mapping. Pervasively available computing now means that 3D geological models can deliver realistic representations of the geometric location of geological units, represented as shells or volumes. The next phase of this process is to populate these models with physical property data that describe subsurface heterogeneity and its associated uncertainty. Achieving this requires the capture and serving of physical, hydrological and other property information from diverse sources. The British Geological Survey (BGS) holds large volumes of subsurface property data, derived both from its own research data collection and from other, often commercially derived, sources. These data can be voxelated to incorporate them into the models, demonstrating property variation within the subsurface geometry. All property data held by BGS have for many years been stored in relational databases to ensure their long-term continuity. However, these databases have, by necessity, complex structures: each contains positional reference data and model information, as well as metadata such as sample identification information and attributes that define the source and processing. Whilst this metadata is critical to assessing the analyses, it also greatly complicates the understanding of variability in the property under assessment, and studying related datasets requires multiple queries, making the extraction of physical properties from these databases difficult. The PropBase Query Layer has therefore been created to allow simplified aggregation and extraction of all related data, presenting complex data in simple, mostly denormalised tables that combine information from multiple databases into a single system. The structure of each relational database is denormalised into a generalised structure so that each dataset can be viewed in a common format through a simple interface, and data are re-engineered to facilitate easy loading. The query layer structure comprises tables, procedures, functions, triggers, views and materialised views. Its main table, PRB_DATA, contains all of the data with the following attribution:
    • a unique identifier
    • the data source
    • the unique identifier from the parent database, for traceability
    • the 3D location
    • the property type
    • the property value
    • the units
    • necessary qualifiers
    • precision information and an audit trail
    Data sources, property types and units are constrained by dictionaries, a key component of the structure that defines which properties and inheritance hierarchies are to be coded and guides what is extracted from the structure and how. Data types served by the Query Layer include site-investigation-derived geotechnical data, hydrogeology datasets, regional geochemistry and geophysical logs, as well as lithological and borehole metadata. The size and complexity of the datasets, with multiple parent structures, requires a technically robust approach to keep the layer synchronised. This is achieved through Oracle procedures written in PL/SQL containing the logic required to carry out the data manipulation (inserts, updates, deletes) that keeps the layer synchronised with the underlying databases, either as regularly scheduled jobs (weekly, monthly etc.) or invoked on demand.
    The PropBase Query Layer's implementation has enabled rapid data discovery, visualisation and interpretation of geological data, simplifying the parameterisation of 3D model volumes and facilitating the study of intra-unit heterogeneity.
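    To make the denormalised structure concrete, the following is a minimal Oracle SQL sketch of a PRB_DATA-style table and its constraining dictionaries, reflecting the attribution listed above; all table and column names, types and sizes are illustrative assumptions rather than the actual BGS PropBase schema.

        -- Dictionary tables that constrain the coded columns (illustrative names)
        CREATE TABLE prb_dic_source   (code VARCHAR2(30) PRIMARY KEY, description VARCHAR2(200));
        CREATE TABLE prb_dic_property (code VARCHAR2(30) PRIMARY KEY, description VARCHAR2(200));
        CREATE TABLE prb_dic_unit     (code VARCHAR2(20) PRIMARY KEY, description VARCHAR2(200));

        -- Denormalised query-layer table holding one property observation per row
        CREATE TABLE prb_data (
            id              NUMBER        PRIMARY KEY,                              -- unique identifier
            data_source     VARCHAR2(30)  NOT NULL REFERENCES prb_dic_source (code),
            parent_id       VARCHAR2(64)  NOT NULL,                                 -- identifier in the parent database (traceability)
            x               NUMBER,                                                 -- 3D location: easting
            y               NUMBER,                                                 -- 3D location: northing
            z               NUMBER,                                                 -- 3D location: elevation or depth
            property_type   VARCHAR2(30)  NOT NULL REFERENCES prb_dic_property (code),
            property_value  NUMBER        NOT NULL,
            unit            VARCHAR2(20)  REFERENCES prb_dic_unit (code),
            qualifier       VARCHAR2(10),                                           -- e.g. detection-limit qualifiers
            precision_info  VARCHAR2(30),
            audit_info      VARCHAR2(200)                                           -- audit trail
        );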

    A solution (data architecture) for handling time-series data - sensor data (4D), its visualisation and the questions around uncertainty of this data

    Geo-environmental research is increasingly in the age of data-driven research. It has become necessary to collect, store, integrate and visualise more subsurface data for environmental research. The information required to facilitate data-driven research is often characterised by its variability, volume, complexity and frequency. This has necessitated the development of suitable data workflows, hybrid data architectures and multiple visualisation solutions to provide proper context to scientists and to enable their understanding of the different trends that the data display across their many scientific interpretations. However, these data, predominantly time-series (4D) acquired through sensors and mostly telemetered, pose significant challenges in quantifying their uncertainty. To validate the research answers, including the data methodologies, the following open questions around uncertainty need addressing, i.e. uncertainty generated from:
    • the instruments used for data capture;
    • the transfer process of the data, often from remote locations through telemetry;
    • the data processing techniques used for harmonising and integrating outputs from multiple sensors;
    • the approximations applied to visualise such data, from various conversion factors to units standardisation.
    The main question remains: how do we deal with the issues around uncertainty when it comes to the large and variable amounts of time-series data we collect, harmonise and visualise for the data-driven geo-environmental research that we undertake today?

    Management of 3D geological models at the British Geological Survey

    The British Geological Survey (BGS) has been building digital 3D structural geological models for around 20 years. Today, we have many models, from local to national scale, which together comprise the National Geological Model[1]. The National Geological Model is constantly evolving, being extended and refined by a range of projects. Depending on the type of model (Quaternary, bedrock), the geological complexity, the scale, and the nature and distribution of available input data (e.g. boreholes), these models are built using a range of methods. These include 1) the construction of interlocking networks of interpreted cross-sections and related subsurface coverage maps, 2) CAD-based geo-object modelling in a 3D scene, and 3) geostatistical implicit/numerical models. This heterogeneous approach to model building allows the geologist to apply the best, most pragmatic method to the project at hand. However, it creates challenges for the systems developer, who must archive and manage the model data in a consistent and standardised form.

    Oracle Spatial in British Geological Survey

    BGS has been using Oracle Spatial to generate geometries for point data sets; these geometries are subsequently processed into ESRI ArcGIS layers and are also queried directly by various applications using Oracle Spatial functions. This talk will illustrate this work by way of examples and will move on to consider the data management issues encountered when modifying long-established data models. The underlying base tables are generally not modified; instead, the spatial geometries are added via related attribute tables, which in turn are kept up to date using a combination of techniques including PL/SQL procedures, views, materialised views and function-based indexes.
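    As a minimal sketch of the pattern described above (base table untouched, geometry held in a related attribute table and spatially indexed), the following Oracle SQL is illustrative only; the table names, columns and the British National Grid SRID (27700) are assumptions, not the actual BGS data model.

        -- Base table is left unchanged; the geometry lives in a related attribute table
        CREATE TABLE borehole (
            borehole_id NUMBER PRIMARY KEY,
            easting     NUMBER NOT NULL,
            northing    NUMBER NOT NULL
        );

        CREATE TABLE borehole_geom (
            borehole_id NUMBER PRIMARY KEY REFERENCES borehole (borehole_id),
            geom        MDSYS.SDO_GEOMETRY
        );

        -- Populate the related table (in practice wrapped in a PL/SQL procedure,
        -- view or materialised view that keeps it up to date)
        INSERT INTO borehole_geom (borehole_id, geom)
        SELECT b.borehole_id,
               MDSYS.SDO_GEOMETRY(2001, 27700,                          -- 2D point, British National Grid
                                  MDSYS.SDO_POINT_TYPE(b.easting, b.northing, NULL),
                                  NULL, NULL)
          FROM borehole b;

        -- Register spatial metadata and index the geometry so it can be queried
        -- with Oracle Spatial operators or served out as GIS layers
        INSERT INTO user_sdo_geom_metadata (table_name, column_name, diminfo, srid)
        VALUES ('BOREHOLE_GEOM', 'GEOM',
                MDSYS.SDO_DIM_ARRAY(
                    MDSYS.SDO_DIM_ELEMENT('X', 0, 700000, 0.005),
                    MDSYS.SDO_DIM_ELEMENT('Y', 0, 1300000, 0.005)),
                27700);

        CREATE INDEX borehole_geom_sidx ON borehole_geom (geom)
            INDEXTYPE IS MDSYS.SPATIAL_INDEX;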

    Open Geoscience Data Models: end of project report

    This report describes a three-year knowledge exchange project, OpenGeoscience Data Models, which was funded by the Natural Environment Research Council (NERC) Knowledge Exchange programme. The project aimed to encourage open sharing of geoscience data models amongst a community of Geological Survey Organisations (GSOs), industry and academia. The data model is a key part of successful information management because it provides a centralised description of the meaning and inter-relationships of the information.

    PropBase “Warehouse” architecture

    PropBase is a “data warehouse” system that extracts, transforms and loads data from across BGS’s heterogeneous property data sources into a simplified data model, presenting them in a single view so that the data are compatible and accessible from a single interface. The system consists of: data tables that form the core of a simplified data structure; coding routines that run at regular intervals to extract, transform and load data into the simplified data structures; and a second-tier, partitioned, denormalised data access layer that serves as the data access point for applications. The system also includes a suite of Java-coded search utilities that facilitate easy data discovery and download, allowing the complex synthesis of many data types simultaneously. There is also a web service for machine-to-machine interaction, enabling other software systems to directly interrogate the datasets in order to visualise and manipulate them. This system will have a significant impact by allowing multiple datasets to be rapidly integrated for scientific understanding whilst ensuring that data are properly managed and available for future use.
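    The following is a brief, illustrative Oracle sketch of how such a partitioned, denormalised access layer and its regularly scheduled refresh might look; the table, partition, job and procedure names (e.g. prb_etl.refresh_source) are hypothetical and are not taken from the actual PropBase system.

        -- Second-tier access layer, list-partitioned by data source so that
        -- each source can be reloaded or exchanged independently
        CREATE TABLE prb_access_layer (
            id             NUMBER,
            data_source    VARCHAR2(30) NOT NULL,
            property_type  VARCHAR2(30) NOT NULL,
            property_value NUMBER,
            unit           VARCHAR2(20),
            x NUMBER, y NUMBER, z NUMBER
        )
        PARTITION BY LIST (data_source) (
            PARTITION p_geotech VALUES ('GEOTECHNICAL'),
            PARTITION p_hydro   VALUES ('HYDROGEOLOGY'),
            PARTITION p_geochem VALUES ('GEOCHEMISTRY'),
            PARTITION p_other   VALUES (DEFAULT)
        );

        -- Scheduled job invoking a (hypothetical) PL/SQL ETL procedure at a regular interval
        BEGIN
            DBMS_SCHEDULER.CREATE_JOB(
                job_name        => 'PRB_REFRESH_WEEKLY',
                job_type        => 'PLSQL_BLOCK',
                job_action      => 'BEGIN prb_etl.refresh_source(''GEOCHEMISTRY''); END;',
                repeat_interval => 'FREQ=WEEKLY; BYDAY=SUN',
                enabled         => TRUE);
        END;
        /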

    Big data in the geoscience: a portal to physical properties

    The geosciences were early adopters of both computing and digital data; the precursors of the SEG-D and SEG-Y geophysical formats date from as far back as 1967. Data standards for seismic (SEG-Y, SEG-D) or geophysical log (LAS, DLIS) data make interpretation and visualisation of data practicable, but their binary nature also makes applying analytical techniques unusually complex. Specialist software is often required to process and interpret different data types. Such problems are exacerbated by historically poor data management practices: datasets are rarely collated at the end of projects or stored with sufficient metadata to describe them accurately, and many strategically useful datasets reach BGS incomplete, unusable or inaccessible. Whether this situation arose through a lack of foresight about the future value of data, poor practice or simply storage space restrictions, these problems pose huge challenges to today’s geoscientists. Consequently, there are major problems with applying big data analytics to geoscience. For example, many techniques do not sample geology directly but use proxies needing further interpretation. The use of analytical techniques has commonly been limited by the high proportion of noise incorporated into the datasets, with very significant interpretation skills required to identify the signal. Thus far, successful applications of “big data” analytics have been limited to closed systems or analyses of very common digital data types. Significant problems remain, including the lack of data that can be immediately interacted with, difficulties in bringing together multiple datasets about related phenomena, and the lack of adequate metadata to understand the context and scope of the available data and how to apply and qualify results. Whilst geoscience datasets have all the attributes of big data – volume, veracity, velocity, value and variety – the last two are disproportionately significant: value determines the usefulness of the data, and variety is the biggest impediment to delivering on the promise that big data offers, especially in the Earth sciences. In order to deliver a standardised platform of data from which individual geological attributes can be identified, BGS has invested in the creation of PropBase (Kingdon et al., 2016). This single portal facilitates the collation of datasets supplied in standardised formats, allowing all data from a single point feature (e.g. a borehole) or area of interest to be extracted together in a common format so that all data can be immediately compared. The PropBase portal allows a researcher to answer the question “What is available at a location?” and has already been used in site characterisation for the UK GeoEnergy Observatories project. Initiatives that allow the collation of high volumes of data in a single extractable format are a critical step towards enabling big data analytics. Combined with the increasing availability and ever-lowering cost of high-power computing and analytical routines, the opportunities for big data analytics are ever growing. However, substantial challenges remain, and new and closer interactions with computer scientists are needed to deliver on this promise.
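    As an illustration of the kind of question the portal answers, the following SQL sketch lists which property types are available within a simple bounding box around a point of interest; it assumes the PRB_DATA-style table and coordinates shown earlier, which are illustrative rather than the actual PropBase schema.

        -- What is available within 500 m (bounding box) of an easting/northing of interest?
        SELECT d.property_type,
               COUNT(*)              AS n_observations,
               MIN(d.property_value) AS min_value,
               MAX(d.property_value) AS max_value
          FROM prb_data d
         WHERE d.x BETWEEN 345000 - 500 AND 345000 + 500
           AND d.y BETWEEN 421000 - 500 AND 421000 + 500
         GROUP BY d.property_type
         ORDER BY d.property_type;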

    Using OGC API standards for marine data delivery : MEDIN Pilot Project

    This report is the published product of a pilot project by the British Geological Survey (BGS), commissioned by the Marine Environmental Data and Information Network (MEDIN) to investigate the implementation of the OGC API – Environmental Data Retrieval (OGC API – EDR) standard to deliver data externally from the MEDIN Geology and Geophysics Data Archive Centre (DAC). The report details the method and technologies used in the attempt to deliver BGS DAC data using the OGC API – EDR standard, and could be used as an OGC API implementation guide for other DACs and as guidance on contributing to existing open-source projects. The report also details aspects of the OGC API – EDR that made it inappropriate for some marine environmental data, and areas where the standard was ambiguous or challenging to implement. Also included are recommendations on how well the OGC API – EDR can support direct access to data for MEDIN, including which data types/formats can be supported. BGS did not have sufficient resource to deliver an OGC API – EDR endpoint by the completion of the project, but did implement an OGC API – Features with Common Query Language (CQL) endpoint, which offers similar functionality but uses different syntax. OGC API – Features + CQL was intended as an intermediary step towards delivering OGC API – EDR. BGS was unable to deliver OGC API – EDR because, after spending time exploring the standard in depth, it proved more complex than anticipated. However, through this endeavour we have made incremental steps towards EDR, and we were able to make a major contribution to an open-source project which will benefit many geospatial data publishers and users.

    Methodology to sustain common information spaces for research collaborations

    Information and knowledge sharing collaborations are essential for scientific research and innovation. They provide opportunities to pool expertise and resources, and they are required to draw on today’s wealth of data to address pressing societal challenges. Establishing effective collaborations depends on the alignment of intellectual and technical capital. In this thesis we investigate the implications and influences of socio-technical aspects of research collaborations to identify methods of facilitating their formation and sustained success. We draw on our experience acquired in an international federated seismological context, and in a large research infrastructure for solid-Earth sciences. We recognise the centrality of the users and propose a strategy to sustain their engagement as actors participating in the collaboration. Our approach promotes and enables their active contribution to the construction and maintenance of Common Information Spaces (CISs). These are shaped by conceptual agreements that are captured and maintained to facilitate mutual understanding and to underpin collaborative work. A user-driven approach shapes the evolution of a CIS based on the requirements of the communities involved in the collaboration. Active user engagement is pursued by partitioning concerns and by targeting users’ interests. For instance, application domain experts focus on scientific and conceptual aspects; data and information experts address knowledge representation issues; and architects and engineers build the infrastructure that populates the common space. We introduce a methodology to sustain CISs and a conceptual framework that has its foundations in a set of agreed Core Concepts forming a Canonical Core (CC). A representation of such a CC, EPOS-DCAT-AP, is also introduced, which leverages and promotes reuse of existing standards. The application of our methodology shows promising results, with good uptake and adoption by the targeted communities. This encourages us to continue applying and evaluating this strategy in the future.